
The command also supports regular expressions, anchoring, grouping, and
much more. Use the man grep command to read more about its capabilities.

Filtering with awk

The awk command is a data processing and extraction Swiss-army knife. You
can use it to identify and return specific fields from a file. To see how it works,
take another close look at our log file. What if we needed to print just the IP
addresses from this file? This is easy to do with awk (Listing 2-25).

$ awk '{print $1}' log.txt

Listing 2-25

Printing the first field

The $1 represents the first field of every line in the file, where the IP
addresses are. By default, awk treats spaces or tabs as field separators (delimiters).
Using the same syntax, we can print additional fields, such as the timestamps.
Listing 2-26 prints the first three fields of every line in the file.

$ awk '{print $1,$2,$3}' log.txt

Listing 2-26

Printing the first three fields
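
To make the field numbering concrete, the following sketch splits a single sample line. The line itself is hypothetical and assumes an Apache-style access log layout, so the field positions may differ in your own log file.

```shell
# Hypothetical log line; the Apache-style layout is an assumption.
line='42.236.10.117 - - [21/Feb/2023:10:15:32 +0000] "GET /index.html HTTP/1.1" 200 512'

# Field 1 is the client IP address.
ip=$(printf '%s\n' "$line" | awk '{print $1}')

# In this layout, fields 4 and 5 together hold the bracketed timestamp,
# because the space inside the brackets also acts as a separator.
timestamp=$(printf '%s\n' "$line" | awk '{print $4, $5}')

echo "$ip"
echo "$timestamp"
```

Because awk splits on whitespace by default, a value that contains a space, such as the timestamp here, ends up spread across more than one field.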

Using similar syntax, we can print the first and last fields simultaneously. In
this case, NF holds the number of fields in the current line, so $NF refers to
the last field (Listing 2-27).

$ awk '{print $1,$NF}' log.txt

Listing 2-27

Printing the first and last field
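
The difference between NF and $NF is easy to miss, so here is a small sketch that prints both for one line. The sample line is hypothetical and assumes an Apache-style access log layout.

```shell
# Hypothetical log line; the Apache-style layout is an assumption.
line='42.236.10.117 - - [21/Feb/2023:10:15:32 +0000] "GET /index.html HTTP/1.1" 200 512'

# NF on its own is the number of fields in the current line.
field_count=$(printf '%s\n' "$line" | awk '{print NF}')

# $NF is the content of the field at position NF, that is, the last field.
last_field=$(printf '%s\n' "$line" | awk '{print $NF}')

# Arithmetic on NF selects fields relative to the end of the line.
second_to_last=$(printf '%s\n' "$line" | awk '{print $(NF-1)}')

echo "$field_count"
echo "$last_field"
echo "$second_to_last"
```

Counting from the end with $NF and $(NF-1) is handy when lines have a varying number of fields but the values of interest always sit at the end.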

We can also change the default delimiter. For example, if we had a CSV file
separated by commas, rather than spaces or tabs, we could pass awk the -F flag to
specify the type of delimiter, as in Listing 2-28.

$ awk -F',' '{print $1}' example_csv.txt

Listing 2-28

Printing the first field using a comma delimiter
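
As a rough illustration of the -F flag, the sketch below runs awk over a few inline CSV rows. The column names and values are made up for this example.

```shell
# Hypothetical CSV data; the columns and values are illustrative only.
csv=$(printf '%s\n' 'name,role,city' 'alice,admin,Toronto' 'bob,user,Tel Aviv')

# With -F',' each comma ends a field, so $1 is the name column.
names=$(printf '%s\n' "$csv" | awk -F',' '{print $1}')

# Spaces are no longer separators, so "Tel Aviv" stays in one field.
cities=$(printf '%s\n' "$csv" | awk -F',' '{print $3}')

echo "$names"
echo "$cities"
```

Note that this simple approach splits on every comma, so it does not handle CSV files in which quoted values contain commas of their own.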

We can even use awk to print the first 10 lines of a file. This emulates the
behavior of the head Linux command. NR is a built-in awk variable that holds the
number of the record (by default, the line) currently being processed, so the
condition NR <= 10 matches only the first 10 lines (Listing 2-29).

$ awk 'NR <= 10' log.txt

Listing 2-29

Printing the first 10 lines of a file
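
The sketch below shows how conditions on NR select lines by position. It uses seq to generate 12 numbered lines as stand-in input, which is an assumption made purely so the example is self-contained.

```shell
# Stand-in input: 12 lines containing the numbers 1 through 12.
# A condition with no action block prints every line for which it is true.
first_ten=$(seq 1 12 | awk 'NR <= 10')

# Two conditions combined with && select a range of lines.
range=$(seq 1 12 | awk 'NR >= 3 && NR <= 5')

# In an END block, NR holds the number of the last record read,
# which equals the total number of lines.
total=$(seq 1 12 | awk 'END {print NR}')

echo "$first_ten"
echo "$range"
echo "$total"
```

NR only equals the total record count once all input has been read, which is why the last command reads it inside an END block.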

You’ll often find it useful to combine grep and awk. For example, you
might want to first find the lines in a file containing the IP address 42.236.10.117
and then print the HTTP paths that this IP address requested (Listing 2-30).

$ grep "42.236.10.117" log.txt | awk '{print $7}'

Listing 2-30

Filtering an IP address and printing the seventh field, representing HTTP paths
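
Two variations on this pipeline may be worth knowing. The sketch below runs them against three hypothetical log lines in an assumed Apache-style layout. The first adds grep's -F option so that the dots in the address are matched literally rather than as regular expression wildcards. The second drops grep entirely and lets awk compare the first field itself.

```shell
# Hypothetical log lines; the Apache-style layout is an assumption.
log=$(printf '%s\n' \
  '42.236.10.117 - - [21/Feb/2023:10:15:32 +0000] "GET /index.html HTTP/1.1" 200 512' \
  '10.0.0.5 - - [21/Feb/2023:10:15:40 +0000] "GET /login HTTP/1.1" 200 128' \
  '42.236.10.117 - - [21/Feb/2023:10:16:01 +0000] "GET /admin HTTP/1.1" 403 64')

# grep -F treats the pattern as a fixed string, so "." matches only a dot.
paths_grep=$(printf '%s\n' "$log" | grep -F "42.236.10.117" | awk '{print $7}')

# awk alone: print field 7 only when field 1 is exactly this address.
paths_awk=$(printf '%s\n' "$log" | awk '$1 == "42.236.10.117" {print $7}')

echo "$paths_grep"
echo "$paths_awk"
```

The awk-only form matches the address only in the first field, whereas grep matches it anywhere on the line, so the two can differ if the address also appears elsewhere, for example in a referrer.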

Black Hat Bash (Early Access) © 2023 by Dolev Farhi and Nick Aleks